{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Autograd: 自动微分\n",
    "\n",
    "这里理解起来会比较费劲，如果一遍不能理解，建议多读几遍反复推敲一下。\n",
    "\n",
    "`autograd` 包是 PyTorch 中所有神经网络的核心。首先让我们简要地介绍它，然后我们将会去训练我们的第一个神经网络。该 `autograd` 软件包为 `Tensors` 上的所有操作提供自动微分。它是一个由运行定义的框架，这意味着以代码运行方式定义你的后向传播，并且每次迭代都可以不同。我们从 `tensor` 和 `gradients` 来举一些例子。\n",
    "\n",
    "## Tensor（张量）\n",
    "\n",
    "`torch.Tensor` 是包的核心类。如果将其属性 `.requires_grad` 设置为 `True`，则会开始跟踪针对 `tensor` 的所有操作。完成计算后，您可以调用 `.backward()` 来自动计算所有梯度。该张量的梯度将累积到 `.grad` 属性中。\n",
    "\n",
    "\n",
    "要停止 `tensor` 历史记录的跟踪，您可以调用 `.detach()`，它将其与计算历史记录分离，并防止将来的计算被跟踪。\n",
    "\n",
    "\n",
    "要停止跟踪历史记录（和使用内存），您还可以将代码块使用 `with torch.no_grad():` 包装起来。在评估模型时，这是特别有用，因为模型在训练阶段具有 `requires_grad = True` 的可训练参数有利于调参，但在评估阶段我们不需要梯度。\n",
    "\n",
    "还有一个类对于 autograd 实现非常重要那就是 Function。Tensor 和 Function 互相连接并构建一个非循环图，它保存整个完整的计算过程的历史信息。每个张量都有一个 .grad_fn 属性保存着创建了张量的 Function 的引用，（如果用户自己创建张量，则g rad_fn 是 None ）。\n",
    "\n",
    "如果你想计算导数，你可以调用 `Tensor.backward()`。如果 Tensor 是标量（即它包含一个元素数据），则不需要指定任何参数 `backward()`，但是如果它有更多元素，则需要指定一个 `gradient` 参数来指定张量的形状。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import torch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "创建一个张量，设置 requires_grad=True 来跟踪与它相关的计算"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[1., 1.],\n",
      "        [1., 1.]], requires_grad=True)\n"
     ]
    }
   ],
   "source": [
    "x = torch.ones(2, 2, requires_grad=True)\n",
    "print(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "针对张量做一个操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[3., 3.],\n",
      "        [3., 3.]], grad_fn=<AddBackward0>)\n"
     ]
    }
   ],
   "source": [
    "y = x + 2\n",
    "print(y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "y 作为操作的结果被创建，所以它有 `grad_fn` 。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<AddBackward0 object at 0x000001787ADEDE80>\n"
     ]
    }
   ],
   "source": [
    "print(y.grad_fn)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "针对 y 做更多的操作："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[27., 27.],\n",
      "        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)\n"
     ]
    }
   ],
   "source": [
    "z = y * y * 3\n",
    "out = z.mean()\n",
    "\n",
    "print(z, out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ".requires_grad_( ... ) 会改变张量的 requires_grad 标记。输入的标记默认为 False 。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False\n",
      "True\n",
      "<SumBackward0 object at 0x00000178757355B0>\n"
     ]
    }
   ],
   "source": [
    "a = torch.randn(2, 2)\n",
    "a = ((a * 3) / (a - 1))\n",
    "print(a.requires_grad)\n",
    "\n",
    "a.requires_grad_(True)\n",
    "print(a.requires_grad)\n",
    "\n",
    "b = (a * a).sum()\n",
    "print(b.grad_fn)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gradients（梯度）\n",
    "\n",
    "我们现在后向传播，因为`out`包含了一个标量，`out.backward()` 等同于 `out.backward(torch.tensor(1.))` 。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "out.backward()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "打印梯度 d(out)/dx"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[4.5000, 4.5000],\n",
      "        [4.5000, 4.5000]])\n"
     ]
    }
   ],
   "source": [
    "print(x.grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should have got a matrix of ``4.5``. 我们把 ``out``\n",
    "*Tensor* 的值定为：“$o$”.\n",
    "将会得到这样的计算公式 $o = \\frac{1}{4}\\sum_i z_i$,\n",
    "$z_i = 3(x_i+2)^2$ and $z_i\\bigr\\rvert_{x_i=1} = 27$.\n",
    "Therefore,\n",
    "$\\frac{\\partial o}{\\partial x_i} = \\frac{3}{2}(x_i+2)$, hence\n",
    "$\\frac{\\partial o}{\\partial x_i}\\bigr\\rvert_{x_i=1} = \\frac{9}{2} = 4.5$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "还可以用 `autograd` 做很事情"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([ 786.6164, 1688.7915,  530.0458], grad_fn=<MulBackward0>)\n"
     ]
    }
   ],
   "source": [
    "x = torch.randn(3, requires_grad=True)\n",
    "\n",
    "y = x * 2\n",
    "while y.data.norm() < 1000:\n",
    "    y = y * 2\n",
    "\n",
    "print(y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([1.0240e+02, 1.0240e+03, 1.0240e-01])\n"
     ]
    }
   ],
   "source": [
    "gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)\n",
    "y.backward(gradients)\n",
    "\n",
    "print(x.grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "你可以通过将代码放在 `with torch.no_grad()`，来停止对从跟踪历史中的 `.requires_grad=True` 的张量自动求导。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n",
      "True\n",
      "False\n"
     ]
    }
   ],
   "source": [
    "print(x.requires_grad)\n",
    "print((x ** 2).requires_grad)\n",
    "\n",
    "with torch.no_grad():\n",
    "    print((x ** 2).requires_grad)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
